Configure a pool of servers with centralized log aggregation using Debian, Docker and Rsyslog

I was recently tasked with upgrading the servers of a small company. After defining the requirements and thinking about the architecture of the whole system, I decided to document the setup here as a guide, both as a reference for my future self (my memory works in an if-I-didn’t-write-it-down-it-didn’t-happen kind of way) and hopefully to help anyone trying to achieve similar results.

Don’t take everything written here as a promise of absolute best practice. What I describe fits my use case well, but you might prefer other solutions, such as using Ansible to create the nodes, or a more modern log aggregation solution such as the ELK stack. This article will also be fairly long, so feel free to jump to the sections that are relevant to you and simply take inspiration to build your own setup. I’ll link to the source documentation as much as possible to help you customize the configuration files to your needs and make your own choices. If you think I made a mistake or you have suggestions to improve this setup, please leave a comment.

1. Requirements

The servers need to host all the services needed by the company, which include miscellaneous but pretty common things such as a Postfix/Dovecot mail server, a WordPress website, and internal webapps. The set of services may change on a regular basis, and I’d like to be able to migrate to another hosting provider as easily as possible if need be.

These services also need to be as isolated from one another as possible in case one of them gets compromised (looking at you, WordPress). Some services are managed by external contractors and must be isolated from the rest.

The setup below emphasizes two key requirements: flexibility and isolation.

2. Architecture

2.1 Containerization

Docker is a good solution to clearly define the boundaries of each service, and isolate them from one another. It also makes it a lot easier to move services and refactor the architecture as needed. I’ll try to keep the base OS installation and configuration relatively minimal (or at least, standard), and run all services as Docker containers.

This serves both of our key requirements: containers can be moved around more easily than bare-metal packages with configuration files and data scattered all over the system, and they are relatively isolated from each other.

However, some services need an even greater level of isolation:

  • because they contain sensitive information (some internal webapps),
  • because they run apps with a higher risk of being compromised (WordPress with some dubious plugins),
  • or because they will be managed by an external contractor that requires restricted access.

2.2 Nodes

For this reason, I’ll split the architecture into multiple VPSs (virtual private servers), which I will call nodes. All servers will share the same base domain name, which I will call mydomain.com 1 here, prefixed with the server’s node number. So the first server will be named n1.mydomain.com, and so on. Tying the name to the server (node) itself rather than semantically to a service name gives me more flexibility to add new nodes and move services around in the future.

N1 will be my central node, and will not contain any user-facing services. It will serve as the main server, centralizing information about, and control over, the other nodes and their services. It will have a slightly different configuration from the other servers. The other nodes (N2, N3, …) will share the same base configuration, and only differ by the Docker services that they run, again to make things as flexible and scalable as possible.

1 To make it easier to follow and adapt to your case, I will highlight in a different color the placeholders that need to be customized in the commands and config files.

3. Implementation

3.1 Hosts provider

All the servers will be hosted at Gandi, as GandiCloud VPS instances (not sponsored, just an FYI), running Debian 12 Bookworm.

3.2 Private network

They will be connected together via a private network, instantiated using Gandi’s web interface. This network is in the 192.168.1.0/24 range. It makes it easier to securely connect the nodes together, for instance by opening the ports of administrative processes only on the private network’s interface. If you would like to do the same thing but your provider doesn’t offer this feature, you can achieve similar results with an appropriate firewall configuration (see section 4.8 below).

3.3 Centralized logging

I want to centralize the logs of all the nodes on N1. This will make it easier to troubleshoot issues, and safer in case one node is compromised or unresponsive, or if a contractor with root access on their dedicated node does something fishy and tries to cover their tracks.

To prevent issues if a sudden flood of logs fills up the storage, the logs will be stored on a separate volume mounted at /logs. They will be categorized by hostname following this file structure: /logs/<hostname>/<facility>/<process>.log 1. For example: /logs/n1.mydomain.com/auth/sshd.log.

1 If you are not familiar with rsyslog, the “facility” can be seen as the category a log belongs to: auth, kern, mail, daemon, …
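To make this concrete, here is a minimal sketch of the receiving side on N1, using rsyslog’s dynamic file names to produce the layout above (the file name, port and ruleset name are my own choices; the forwarding side is covered in section 4.11):

# /etc/rsyslog.d/10-central.conf (on N1 only)
# Accept logs from the other nodes over plain TCP on the private network.
module(load="imtcp")
input(type="imtcp" port="514" ruleset="central")

# One file per host, facility and process, as described above.
template(name="PerHostFile" type="string"
         string="/logs/%HOSTNAME%/%syslogfacility-text%/%programname%.log")

ruleset(name="central") {
    action(type="omfile" dynaFile="PerHostFile")
}

Plain TCP syslog is neither authenticated nor encrypted, so make sure port 514 is only reachable from the private network.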


Enough talk, let’s dive in. Most of the configuration will be common to all nodes, so we’ll use N1 as an example. I will make it clear in the few cases where there are differences between N1 and the other nodes.

4. Node configuration

4.1 DNS
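I won’t detail Gandi’s DNS interface here; the records follow directly from the naming scheme described above. Something along these lines, with placeholder IPs (203.0.113.0/24 is a documentation range):

;; zone mydomain.com
n1        1800 IN A      203.0.113.10
n2        1800 IN A      203.0.113.11
n3        1800 IN A      203.0.113.12
;; per-node subdomains (traefik.n1, ...) can be covered by a wildcard
*.n1      1800 IN CNAME  n1
;; services point at whichever node currently hosts them
portainer 1800 IN CNAME  n1
www       1800 IN CNAME  n2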

4.2 System updates
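First things first: bring the freshly created instance up to date.

$ sudo apt update && sudo apt full-upgrade
$ sudo reboot    # only needed if a new kernel was installed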

4.3 Hostname
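Each node gets its fully qualified domain name as hostname, matching the DNS records above:

$ sudo hostnamectl set-hostname n1.mydomain.com

# Make sure the name also resolves locally; Debian conventionally maps it
# to 127.0.1.1 in /etc/hosts:
127.0.1.1    n1.mydomain.com n1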

4.4 Date, time and timezone
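Set your timezone and make sure the clock is synchronized (consistent timestamps matter a lot once the logs are centralized); Europe/Paris is just my choice:

$ sudo timedatectl set-timezone Europe/Paris
$ sudo timedatectl set-ntp true    # enable systemd-timesyncd
$ timedatectl status               # check "System clock synchronized: yes"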

4.5 Private network interface
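The private NIC has to be configured by hand. A minimal sketch, assuming the classic ifupdown stack and that the private interface shows up as ens4 (check with `ip link`; your image may use cloud-init or systemd-networkd instead, in which case adapt accordingly). N1 takes 192.168.1.1, N2 takes 192.168.1.2, and so on, mirroring the node numbers:

# /etc/network/interfaces.d/private
auto ens4
iface ens4 inet static
    address 192.168.1.1/24

$ sudo ifup ens4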

4.6 SSH
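A common hardening baseline: key-based authentication only. Debian 12’s sshd reads drop-ins from /etc/ssh/sshd_config.d/, so the stock configuration can stay untouched. Make sure your key is in ~/.ssh/authorized_keys before locking yourself out.

# /etc/ssh/sshd_config.d/hardening.conf
PermitRootLogin prohibit-password
PasswordAuthentication no
KbdInteractiveAuthentication no
X11Forwarding no

$ sudo systemctl reload ssh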

4.7 Shell

4.8 Firewall
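I use UFW, which keeps the rules readable. Only SSH and the web ports are exposed publicly; everything coming from the private network is trusted:

$ sudo apt install ufw
$ sudo ufw default deny incoming
$ sudo ufw default allow outgoing
$ sudo ufw allow ssh
$ sudo ufw allow http
$ sudo ufw allow https
$ sudo ufw allow from 192.168.1.0/24   # trust the private network
$ sudo ufw enable

One caveat: ports published by Docker are handled by Docker’s own iptables rules and bypass UFW, so don’t publish a container port and assume UFW will protect it.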

4.9 Automatic system updates
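Security updates should install themselves:

$ sudo apt install unattended-upgrades
$ sudo dpkg-reconfigure -plow unattended-upgrades   # answer "Yes"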

4.10 System emails
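Tools like unattended-upgrades, cron and logcheck report by mailing root, so each node needs a way to send mail. Since the actual mail server runs in Docker elsewhere, a lightweight relay is enough. A sketch using msmtp; the host name, account and password location are placeholders for whatever SMTP server you relay through:

$ sudo apt install msmtp-mta bsd-mailx

# /etc/msmtprc
defaults
auth           on
tls            on
tls_trust_file /etc/ssl/certs/ca-certificates.crt
aliases        /etc/aliases

account        mailserver
host           mail.mydomain.com
port           587
from           n1@mydomain.com
user           n1@mydomain.com
passwordeval   cat /etc/msmtp-password

account default : mailserver

# /etc/aliases — redirect root's mail to a mailbox you actually read
root: admin@mydomain.com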

4.11 Logs
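The receiving side on N1 was shown in section 3.3. On every node, rsyslog forwards a copy of the local log stream to N1. A minimal sketch, assuming N1’s private address is 192.168.1.1 (plain TCP is fine on a private network; consider RELP or TLS over less trusted links):

# /etc/rsyslog.d/20-forward.conf
# Queue to disk while N1 is unreachable, and retry forever.
action(type="omfwd" target="192.168.1.1" port="514" protocol="tcp"
       queue.type="LinkedList" queue.filename="fwd-n1"
       action.resumeRetryCount="-1")

$ sudo systemctl restart rsyslog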

4.12 Fail2ban
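Fail2ban bans IPs that repeatedly fail to authenticate. The Debian package enables the sshd jail out of the box; a minimal jail.local to tune the defaults could look like this:

$ sudo apt install fail2ban

# /etc/fail2ban/jail.local
[DEFAULT]
ignoreip = 127.0.0.1/8 192.168.1.0/24   # never ban localhost or the private network
bantime  = 1h
findtime = 10m
maxretry = 5

[sshd]
enabled = true

$ sudo systemctl restart fail2ban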

4.13 Logcheck
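Logcheck periodically mails root a summary of log lines that don’t match its database of known-harmless patterns, a good safety net on top of the centralized logs:

$ sudo apt install logcheck

# In /etc/logcheck/logcheck.conf, "server" is a reasonable middle ground
# between "workstation" (verbose) and "paranoid":
REPORTLEVEL="server"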

Now that everything is set up, it’s a good idea to spend a few minutes looking at the logs to make sure everything looks right:

$ lnav "root@n1.mydomain.com:/logs/*/*"

You should see events from all your nodes, most likely including:

  • Some failed SSH connection attempts:
    n1.mydomain.com sshd[xxxxxxx]: Invalid user guest from X.X.X.X port X
  • Fail2ban doing its job:
    n3.mydomain.com fail2ban[xxxxxx] [sshd] Found X.X.X.X
  • The firewall blocking connections: kernel [UFW BLOCK] ...

That’s it for the system configuration! Take a break and stretch your legs before diving into the last part: installing Docker and running services inside containers.

5. Docker

To make things simpler to manage and move around, all the Docker services will be centralized in /srv, with a dedicated subdirectory for each service containing at least a docker-compose.yml definition file, as well as the persistent volumes, Dockerfile, or any other resource that service needs. A global docker-compose.yml file in /srv will simply include each subdirectory, making it easy to add or remove a service: a single line needs to be added, removed, or commented out.
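A sketch of that global file, assuming a Docker Compose recent enough (v2.20+) to support the top-level include element (the service names are examples):

# /srv/docker-compose.yml
include:
  - traefik/docker-compose.yml
  - portainer/docker-compose.yml
  - mariadb/docker-compose.yml
  - phpmyadmin/docker-compose.yml
  - wordpress/docker-compose.yml
  # - newservice/docker-compose.yml   # one line to add or remove a service

A nice side effect of include is that all the services end up in a single Compose project, so they share the default network and can reach each other by service name.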

5.1 Installation
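I install Docker from its official apt repository rather than from Debian’s, to get current versions of the Compose plugin. At the time of writing, the official documentation boils down to:

$ sudo apt install ca-certificates curl
$ sudo install -m 0755 -d /etc/apt/keyrings
$ sudo curl -fsSL https://download.docker.com/linux/debian/gpg -o /etc/apt/keyrings/docker.asc
$ echo "deb [arch=amd64 signed-by=/etc/apt/keyrings/docker.asc] \
    https://download.docker.com/linux/debian bookworm stable" | \
    sudo tee /etc/apt/sources.list.d/docker.list
$ sudo apt update
$ sudo apt install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin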

5.2 Environment variables
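Compose automatically reads a .env file sitting next to the compose file, which is a convenient place for values shared across services and for secrets that shouldn’t be committed anywhere. The variable names below are examples, referenced from the compose files as ${VARIABLE}:

# /srv/.env — keep this out of any VCS and readable by root only
DOMAIN=mydomain.com
TZ=Europe/Paris
MARIADB_ROOT_PASSWORD=changeme
WORDPRESS_DB_PASSWORD=changeme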

5.3 Internal services

We’ll install two services that will be common to all nodes and serve as a basis for the other services: Traefik as a front-end proxy, and Portainer to centralize the management of services across all nodes. As always, customize to your needs.

5.3.1 Traefik
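A sketch of /srv/traefik/docker-compose.yml: Traefik watches the Docker socket, terminates TLS with Let’s Encrypt, and routes requests to containers based on their labels. The version, e-mail and hostnames are placeholders, and the dashboard router below has no authentication yet, so add an auth middleware before exposing it for real.

services:
  traefik:
    image: traefik:v2.11            # pin whatever version is current
    restart: unless-stopped
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - ./letsencrypt:/letsencrypt
    command:
      - --providers.docker=true
      - --providers.docker.exposedbydefault=false
      - --entrypoints.web.address=:80
      - --entrypoints.websecure.address=:443
      - --entrypoints.web.http.redirections.entrypoint.to=websecure
      - --certificatesresolvers.le.acme.tlschallenge=true
      - --certificatesresolvers.le.acme.email=admin@mydomain.com
      - --certificatesresolvers.le.acme.storage=/letsencrypt/acme.json
      - --api.dashboard=true
    labels:
      - traefik.enable=true
      - traefik.http.routers.dashboard.rule=Host(`traefik.n1.mydomain.com`)
      - traefik.http.routers.dashboard.entrypoints=websecure
      - traefik.http.routers.dashboard.tls.certresolver=le
      - traefik.http.routers.dashboard.service=api@internal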

5.3.2 Portainer
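On N1, Portainer runs as the management UI; the other nodes only run the lightweight portainer/agent image (listening on 9001, kept on the private network) and get registered as environments in the UI. A sketch of /srv/portainer/docker-compose.yml on N1:

services:
  portainer:
    image: portainer/portainer-ce:latest
    restart: unless-stopped
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - ./data:/data
    labels:
      - traefik.enable=true
      - traefik.http.routers.portainer.rule=Host(`portainer.mydomain.com`)
      - traefik.http.routers.portainer.entrypoints=websecure
      - traefik.http.routers.portainer.tls.certresolver=le
      # Portainer serves plain HTTP on 9000 inside the container
      - traefik.http.services.portainer.loadbalancer.server.port=9000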

5.4 User-facing services

As an example of a user-facing service to run with Docker, we will install a WordPress website. This requires a database, so we’ll install a MariaDB server as well. And for good measure, we’ll also add a phpMyAdmin interface.

5.4.1 MariaDB
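A sketch of /srv/mariadb/docker-compose.yml. The passwords come from the .env file of section 5.2, and the data lives in a bind mount next to the compose file, so the whole service can be moved by copying the directory. No Traefik labels here: the database is only reachable from the other containers.

services:
  mariadb:
    image: mariadb:10.11            # pin a current LTS version
    restart: unless-stopped
    environment:
      MARIADB_ROOT_PASSWORD: ${MARIADB_ROOT_PASSWORD}
      MARIADB_DATABASE: wordpress
      MARIADB_USER: wordpress
      MARIADB_PASSWORD: ${WORDPRESS_DB_PASSWORD}
    volumes:
      - ./data:/var/lib/mysql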

5.4.2 PHPMyAdmin
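phpMyAdmin only needs to know where the database is; thanks to the shared Compose project it can reach it by service name. The hostname in the label is a placeholder (and this is a service you may prefer not to expose publicly at all):

services:
  phpmyadmin:
    image: phpmyadmin:latest
    restart: unless-stopped
    environment:
      PMA_HOST: mariadb
    labels:
      - traefik.enable=true
      - traefik.http.routers.pma.rule=Host(`pma.mydomain.com`)
      - traefik.http.routers.pma.entrypoints=websecure
      - traefik.http.routers.pma.tls.certresolver=le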

5.4.3 WordPress
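Finally, WordPress itself, pointed at the MariaDB service and exposed through Traefik. A sketch of /srv/wordpress/docker-compose.yml; the www hostname is an example:

services:
  wordpress:
    image: wordpress:latest
    restart: unless-stopped
    environment:
      WORDPRESS_DB_HOST: mariadb
      WORDPRESS_DB_NAME: wordpress
      WORDPRESS_DB_USER: wordpress
      WORDPRESS_DB_PASSWORD: ${WORDPRESS_DB_PASSWORD}
    volumes:
      - ./html:/var/www/html
    labels:
      - traefik.enable=true
      - traefik.http.routers.wordpress.rule=Host(`www.mydomain.com`)
      - traefik.http.routers.wordpress.entrypoints=websecure
      - traefik.http.routers.wordpress.tls.certresolver=le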

5.4.4 Monitoring Docker services with Fail2Ban
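Fail2ban can also watch the containers, with one twist: since Docker-published ports bypass the INPUT chain, bans must be inserted into the DOCKER-USER chain instead. A sketch using the traefik-auth filter shipped with recent Fail2ban versions, assuming Traefik is configured to write its access log to a file on the host (e.g. --accesslog.filepath=/srv/traefik/logs/access.log, with the directory bind-mounted into the container):

# /etc/fail2ban/jail.d/traefik.local
[traefik-auth]
enabled = true
port    = http,https
logpath = /srv/traefik/logs/access.log
chain   = DOCKER-USER   # ban where Docker's traffic actually flows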

If you want to read more about best practices for Docker container security, take a look at this article: https://www.linuxserver.io/blog/docker-security-practices.

5.5 Main docker-compose file

6. Backups

Conclusion

If everything is set up correctly, you should now have:

  • a Traefik dashboard on https://traefik.n1.mydomain.com/dashboard/,
  • a Portainer dashboard on https://portainer.mydomain.com/,
  • the logs of every node and every service in /logs on N1, which you can monitor from your local machine using
    $ lnav "root@n1.mydomain.com:/logs/*/*"

If you do, congratulations: your servers are working properly!
